Machine learning (ML) based smart meter data analytics is very promising for energy management and demand-response applications in advanced metering infrastructure (AMI). A key challenge in developing distributed ML applications for AMI is to preserve user privacy while allowing active end-user participation. This paper addresses this challenge and proposes a privacy-preserving federated learning framework for ML applications in AMI. We consider each smart meter as a federated edge device hosting an ML application that exchanges information with a central aggregator or data concentrator. Instead of transferring the raw data sensed by the smart meters, the ML model weights are transferred to the aggregator to preserve privacy. The aggregator processes these parameters to devise a robust ML model that can be substituted at each edge device. We also discuss strategies to enhance privacy and improve communication efficiency while sharing the ML model parameters, which are suited to the relatively slow network connections in AMI. We demonstrate the proposed framework on a use-case federated ML (FML) application that improves short-term load forecasting (STLF). We use a long short-term memory (LSTM) recurrent neural network (RNN) model for STLF. In our architecture, we assume that there is an aggregator connected to a group of smart meters. The aggregator uses the learned model gradients received from the federated smart meters to generate an aggregate, robust RNN model, which improves the forecasting accuracy for both individual and aggregated STLF. Our results indicate that with FML, forecasting accuracy is increased while preserving the data privacy of the end users.
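As a hedged illustration of the aggregation step described above (not the paper's released code), the following PyTorch sketch averages locally trained LSTM weights FedAvg-style; the model sizes, the three-meter setup, and all hyperparameters are placeholders.

```python
# Minimal sketch of the privacy-preserving aggregation step: only model
# weights leave the edge, never raw meter readings. Assumes PyTorch.
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """LSTM RNN for short-term load forecasting at one smart meter."""
    def __init__(self, n_features=1, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, timesteps, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict next-step load

def federated_average(meter_states):
    """Aggregate per-meter model weights into one robust global model."""
    avg = {k: torch.zeros_like(v) for k, v in meter_states[0].items()}
    for state in meter_states:
        for k, v in state.items():
            avg[k] += v / len(meter_states)
    return avg

# Each meter trains locally, then ships only its state_dict upstream:
meters = [LoadForecaster() for _ in range(3)]
global_weights = federated_average([m.state_dict() for m in meters])
for m in meters:                      # broadcast the aggregated model back
    m.load_state_dict(global_weights)
```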
t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard packages of t-SNE, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate the BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving parallelization of multithreaded steps. Our implementation (Acc-t-SNE) is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py, respectively, on a 32-core Intel(R) Icelake cloud instance.
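For reference, the stock scikit-learn BH t-SNE call that serves as the paper's baseline is sketched below; Acc-t-SNE's own API is not given in the abstract, so this only illustrates the Barnes-Hut settings (the angle/theta trade-off and multithreading) being accelerated, on synthetic data.

```python
# Baseline the paper compares against: scikit-learn's Barnes-Hut t-SNE.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 50)).astype(np.float32)

emb = TSNE(
    n_components=2,
    method="barnes_hut",   # BH t-SNE: O(N log N) via quadtree approximation
    angle=0.5,             # BH accuracy/speed trade-off (theta)
    n_jobs=-1,             # multithreaded attractive/repulsive force steps
    init="pca",
    random_state=0,
).fit_transform(X)
print(emb.shape)           # (10000, 2)
```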
Targeted syntactic evaluations of language models ask whether models show stable preferences for syntactically acceptable content over minimal-pair unacceptable inputs. Most targeted syntactic evaluation datasets ask models to make these judgements with just a single context-free sentence as input. This does not match language models' training regime, in which input sentences are always highly contextualized by the surrounding corpus. This mismatch raises an important question: how robust are models' syntactic judgements in different contexts? In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality. We find that model judgements are generally robust when placed in randomly sampled linguistic contexts. However, they are substantially unstable for contexts containing syntactic structures matching those in the critical test content. Among all tested models (GPT-2 and five variants of OPT), we significantly improve models' judgements by providing contexts with matching syntactic structures, and conversely significantly worsen them using unacceptable contexts with matching but violated syntactic structures. This effect is amplified by the length of the context, except for unrelated inputs. We show that these changes in model performance are not explainable by simple features matching the context and the test inputs, such as lexical overlap and dependency overlap. This sensitivity to highly specific syntactic features of the context can only be explained by the models' implicit in-context learning abilities.
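A minimal sketch of how such a minimal-pair judgement with a prepended context can be scored, assuming HuggingFace transformers and GPT-2; the stimuli below are illustrative, not the paper's test items.

```python
# Score an acceptable vs. unacceptable sentence after a context whose
# syntactic structure matches the critical test content.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def logprob(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)          # loss = mean token NLL
    return -out.loss.item() * (ids.shape[1] - 1)  # total log-probability

context = "The authors near the senator are here. "   # matching structure
good = context + "The keys to the cabinet are on the table."
bad  = context + "The keys to the cabinet is on the table."
print(logprob(good) > logprob(bad))  # a stable preference => True
```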
We introduce LaViLa, a new approach to learning video-language representations by leveraging Large Language Models (LLMs). We repurpose pre-trained LLMs to be conditioned on visual input, and finetune them to create automatic video narrators. Our auto-generated narrations offer a number of advantages, including dense coverage of long videos, better temporal synchronization of the visual information and text, and much higher diversity of text. The video-text embedding learned contrastively with these additional auto-generated narrations outperforms the previous state-of-the-art on multiple first-person and third-person video tasks, both in zero-shot and finetuned setups. Most notably, LaViLa obtains an absolute gain of 10.1% on the EGTEA classification and 5.9% on the Epic-Kitchens-100 multi-instance retrieval benchmarks. Furthermore, LaViLa trained with only half the narrations from the Ego4D dataset outperforms baseline models trained on the full set, and shows positive scaling behavior on increasing pre-training data and model size.
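The contrastively learned video-text embedding can be sketched as a symmetric InfoNCE (CLIP-style) objective over (clip, narration) pairs; the encoders, batch size, and dimensions below are stand-ins, not LaViLa's actual components.

```python
# CLIP-style contrastive objective over paired video clips and narrations.
import torch
import torch.nn.functional as F

def clip_style_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: matched pairs lie on the diagonal."""
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.T / temperature        # (B, B) similarity matrix
    targets = torch.arange(len(v))        # index of each positive pair
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

video_emb = torch.randn(8, 256)           # from a video encoder (placeholder)
text_emb = torch.randn(8, 256)            # from a narration text encoder
print(clip_style_loss(video_emb, text_emb).item())
```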
Training Graph Neural Networks at scale with minibatch sampling, on graphs containing billions of vertices and edges, poses a key challenge: strong-scaling the graph and training examples results in lower per-node compute, higher communication volume, and potential performance loss. DistGNN-MB employs a novel Historical Embedding Cache combined with compute-communication overlap to address this challenge. On a 32-node (64-socket) cluster of $3^{rd}$ generation Intel Xeon Scalable Processors with 36 cores per socket, DistGNN-MB trains 3-layer GraphSAGE and GAT models on OGBN-Papers100M to convergence with epoch times of 2 seconds and 4.9 seconds, respectively, on 32 compute nodes. At this scale, DistGNN-MB trains GraphSAGE 5.2x faster than the widely-used DistDGL. DistGNN-MB trains GraphSAGE and GAT 10x and 17.2x faster, respectively, as compute nodes scale from 2 to 32.
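A conceptual sketch of the Historical Embedding Cache idea, under the assumption that stale embeddings of remote vertices are reused instead of being recomputed or communicated every minibatch; the class and its bookkeeping are illustrative, not DistGNN-MB's implementation.

```python
# Reuse (possibly stale) historical embeddings for remote/halo vertices
# to trade a little staleness for much less communication per minibatch.
import torch

class HistoricalEmbeddingCache:
    def __init__(self, num_nodes, dim):
        self.store = torch.zeros(num_nodes, dim)
        self.valid = torch.zeros(num_nodes, dtype=torch.bool)

    def update(self, node_ids, embeddings):
        """Record embeddings computed in this minibatch for later reuse."""
        self.store[node_ids] = embeddings.detach()
        self.valid[node_ids] = True

    def lookup(self, node_ids):
        """Return cached embeddings and a hit mask; misses must be
        computed (or fetched from their owning rank) the usual way."""
        hits = self.valid[node_ids]
        return self.store[node_ids], hits

cache = HistoricalEmbeddingCache(num_nodes=1000, dim=64)
cache.update(torch.tensor([3, 7]), torch.randn(2, 64))
emb, hits = cache.lookup(torch.tensor([3, 7, 42]))
print(hits)   # tensor([ True,  True, False])
```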
Damage to the inferior frontal gyrus (Broca's area) can cause agrammatic aphasia wherein patients, although able to comprehend, lack the ability to form complete sentences. This inability leads to communication gaps which cause difficulties in their daily lives. The usage of assistive devices can help in mitigating these issues and enable the patients to communicate effectively. However, due to the lack of large-scale studies of linguistic deficits in aphasia, research on such assistive technology is relatively limited. In this work, we present two contributions that aim to re-initiate research and development in this field. Firstly, we propose a model that uses linguistic features from small-scale studies on aphasia patients and generates large-scale datasets of synthetic aphasic utterances from grammatically correct datasets. We show that the mean length of utterance, the noun/verb ratio, and the simple/complex sentence ratio of our synthetic datasets correspond to the reported features of aphasic speech. Further, we demonstrate how the synthetic datasets may be utilized to develop assistive devices for aphasia patients. The pre-trained T5 transformer is fine-tuned using the generated dataset to suggest 5 corrected sentences given an aphasic utterance as input. We evaluate the efficacy of the T5 model using the BLEU and cosine semantic similarity scores. Affirming results were obtained, with a BLEU score of 0.827/1.00 and a semantic similarity of 0.904/1.00. These results provide a strong foundation for the concept that a synthetic dataset based on small-scale studies on aphasia can be used to develop effective assistive technology.
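A hedged sketch of the inference step, assuming HuggingFace transformers: a fine-tuned T5 proposes five corrected sentences for an aphasic utterance via beam search. The t5-small checkpoint and the "correct:" prefix are placeholders, not the paper's released artifacts.

```python
# Suggest 5 corrected sentences for an aphasic utterance with beam search.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")  # fine-tuned
                                                                # in the paper

utterance = "want water drink"
ids = tok("correct: " + utterance, return_tensors="pt").input_ids
outputs = model.generate(ids, num_beams=5, num_return_sequences=5,
                         max_length=32)
for o in outputs:
    print(tok.decode(o, skip_special_tokens=True))
```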
People rely on news to learn about what is happening around the world and to inform their daily lives. In today's world, where the proliferation of fake news is rampant, having a large-scale and high-quality source of authentic news articles, along with information about their category of publication, is valuable for learning the natural language syntax and semantics of real news. As part of this work, we present a News Category Dataset that contains around 200k news headlines from the years 2012 to 2018 obtained from HuffPost, along with useful metadata to enable various NLP tasks. In this paper, we also produce some novel insights from the dataset and describe various existing and potential applications of the dataset.
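A minimal sketch of consuming such a headline dataset, assuming a JSON Lines layout with "headline" and "category" fields as in the published dataset; the file name is a placeholder.

```python
# Load one record per line and inspect the category distribution.
import json
from collections import Counter

records = []
with open("News_Category_Dataset.json") as f:  # placeholder path
    for line in f:
        records.append(json.loads(line))

print(len(records))                                  # ~200k headlines
print(Counter(r["category"] for r in records).most_common(5))
print(records[0]["headline"], "->", records[0]["category"])
```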
We present a reinforcement learning based framework for automatically discovering patterns that are achievable from any initial configuration of a swarm of fat robots. In particular, we model the problems of collision-free gathering and mutual visibility in fat-robot swarms and discover patterns for solving them using our framework. We show that by shaping the reward signal based on certain constraints, such as mutual visibility and safe proximity, the robots can discover collision-free trajectories that lead to the formation of well-formed gathering and visibility patterns.
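An illustrative shaped reward combining a gathering term with a safe-proximity penalty for disk-shaped (fat) robots; the constants and the exact constraint terms are assumptions, not the paper's reward.

```python
# Reward shaping sketch: pull robots toward their centroid while
# penalizing pairs that come closer than a safe separation.
import numpy as np

def shaped_reward(positions, radius, safe_gap=0.1):
    """positions: (N, 2) robot centers; fat robots are disks of `radius`."""
    center = positions.mean(axis=0)
    gather = -np.linalg.norm(positions - center, axis=1).mean()  # pull inward
    penalty = 0.0
    n = len(positions)
    for i in range(n):                       # penalize unsafe proximity
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            if d < 2 * radius + safe_gap:    # overlapping or too close
                penalty -= 1.0
    return gather + penalty

print(shaped_reward(np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]]), 0.2))
```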
In this demo paper, we design and prototype RhythmEdge, a low-cost, deep-learning-based contactless system for regular heart rate (HR) monitoring applications. RhythmEdge benefits over existing approaches by facilitating contactless operation, real-time/offline processing, and inexpensive, readily available sensing components and computing devices. Our RhythmEdge system is portable and can easily be deployed for reliable HR estimation in moderately controlled indoor or outdoor environments. RhythmEdge measures HR by detecting blood volume changes from facial videos (remote photoplethysmography; rPPG) and provides instant assessment using off-the-shelf, commercially available, resource-constrained edge platforms and video cameras. We demonstrate the scalability, flexibility, and compatibility of RhythmEdge by deploying it on three resource-constrained platforms with different architectures (NVIDIA Jetson Nano, Google Coral Development Board, Raspberry Pi) and three heterogeneous cameras (web camera, action camera, and DSLR). RhythmEdge further stores longitudinal cardiovascular information and provides instant notifications to users. We thoroughly tested the prototype's stability, latency, and feasibility on the three edge computing platforms by analyzing their runtime, memory, and power usage.
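As a hedged sketch of the sensing principle (RhythmEdge itself uses a deep model rather than this classical pipeline): blood volume changes modulate the mean green intensity of facial skin, and the dominant frequency in a physiologically plausible band gives the heart rate.

```python
# Classical rPPG baseline: estimate HR from the per-frame mean green
# value of the face region via the dominant spectral peak.
import numpy as np

def estimate_hr(green_means, fps):
    """green_means: per-frame mean green intensity of the face region."""
    x = green_means - np.mean(green_means)          # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)          # 42-240 bpm plausible band
    peak = freqs[band][np.argmax(power[band])]
    return peak * 60.0                              # beats per minute

fps = 30.0
t = np.arange(0, 10, 1 / fps)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(len(t))
print(estimate_hr(signal, fps))                     # ~72 bpm
```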
A person walking along a city street who tries to model every aspect of the world would quickly be overwhelmed by a multitude of shops, cars, and people, all following their own complex and inscrutable dynamics. Exploration and navigation in such an environment is an everyday task, requiring no vast exertion of mental resources. Is it possible to turn this fire hose of sensory information into a minimal latent state that is necessary and sufficient for an agent to successfully act in the world? We formulate this question concretely and propose the Agent-Controllable State Discovery algorithm (AC-State), which has theoretical guarantees and is practically demonstrated to discover the minimal controllable latent state, which contains all of the information for controlling the agent, while fully discarding all irrelevant information. The algorithm consists of a multi-step inverse model (predicting actions from distant observations) with an information bottleneck. AC-State enables localization, exploration, and navigation without any reward or demonstrations. We demonstrate the discovery of the controllable latent state in three domains: localizing a robot arm with distractions (e.g., changing lighting conditions and background), exploring in a maze alongside other agents, and navigating in the Matterport house simulator.
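A minimal sketch of the core AC-State objective, assuming discrete actions: a multi-step inverse model predicts the first action from encoded observations (o_t, o_{t+k}), with the small encoder output serving as the information bottleneck; all sizes are illustrative.

```python
# Multi-step inverse model: only factors of the observation that the
# agent controls help predict its own action, so the bottlenecked
# encoder is pushed to keep exactly the controllable latent state.
import torch
import torch.nn as nn

obs_dim, latent_dim, n_actions = 128, 16, 4

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                        nn.Linear(64, latent_dim))      # the bottleneck
inverse_head = nn.Linear(2 * latent_dim, n_actions)

def ac_state_loss(o_t, o_tk, a_t):
    """Predict a_t from encodings of o_t and a distant o_{t+k}."""
    s_t, s_tk = encoder(o_t), encoder(o_tk)
    logits = inverse_head(torch.cat([s_t, s_tk], dim=-1))
    return nn.functional.cross_entropy(logits, a_t)

o_t, o_tk = torch.randn(32, obs_dim), torch.randn(32, obs_dim)
a_t = torch.randint(0, n_actions, (32,))
print(ac_state_loss(o_t, o_tk, a_t).item())
```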